The Tanl tagger for Named Entity Recognition on Transcribed Broadcast News
Abstract
The Tanl tagger is a configurable tagger based on a Maximum Entropy classifier, which uses dynamic programming to select the best sequence of tags. We applied it to the NER tagging task, customizing the feature set and including features derived from dictionaries extracted from the training corpus. The final accuracy of the tagger is further improved by applying simple heuristic rules.
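The dynamic-programming step mentioned in the abstract can be illustrated with a minimal Viterbi-style sketch. This is not the Tanl implementation: the function `viterbi_decode`, the `distribution` callback (standing in for a trained Maximum Entropy classifier that returns P(tag | previous tag, position)), the tag set, and the toy probabilities are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical classifier interface (an assumption, not Tanl's API):
# given the previous tag and a token position, return P(tag | prev, position).
TagDistribution = Callable[[str, int], Dict[str, float]]

START = "<s>"  # pseudo-tag preceding the first token


def _log(p: float) -> float:
    """Log-probability, mapping zero probability to -infinity."""
    return math.log(p) if p > 0.0 else float("-inf")


def viterbi_decode(n_tokens: int, tags: Sequence[str],
                   distribution: TagDistribution) -> List[str]:
    """Return the highest-probability tag sequence for n_tokens tokens."""
    if n_tokens == 0:
        return []
    first = distribution(START, 0)
    # scores[i][t]: best log-probability of any sequence ending in tag t at i
    scores = [{t: _log(first.get(t, 0.0)) for t in tags}]
    # back[i][t]: previous tag on that best sequence
    back: List[Dict[str, str]] = [{}]
    for i in range(1, n_tokens):
        scores.append({})
        back.append({})
        # One classifier call per candidate previous tag.
        dists = {p: distribution(p, i) for p in tags}
        for t in tags:
            best_prev = max(
                tags,
                key=lambda p: scores[i - 1][p] + _log(dists[p].get(t, 0.0)),
            )
            scores[i][t] = (scores[i - 1][best_prev]
                            + _log(dists[best_prev].get(t, 0.0)))
            back[i][t] = best_prev
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: scores[-1][t])]
    for i in range(n_tokens - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Toy stand-in for a trained classifier over a three-token sentence.
TAGS = ["O", "B-PER", "I-PER"]


def toy_distribution(prev: str, i: int) -> Dict[str, float]:
    if i == 0:
        return {"B-PER": 0.6, "O": 0.4}
    if i == 1:
        if prev == "B-PER":
            return {"I-PER": 0.8, "O": 0.2}
        return {"O": 0.7, "B-PER": 0.3}
    return {"O": 0.9, "B-PER": 0.1}


print(viterbi_decode(3, TAGS, toy_distribution))
```

Working in log space avoids numerical underflow on long sentences, and the table of back-pointers keeps decoding at one pass over the tokens rather than enumerating every possible tag sequence.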
Similar resources
The Tanl Tagger for Named Entity Recognition on Transcribed Broadcast News at Evalita 2011
The Tanl tagger is a flexible sequence labeller based on a Conditional Markov Model that can be configured to use different classifiers and to extract features according to feature templates expressed through patterns provided in a configuration file. The Tanl Tagger was applied to the task of Named Entity Recognition (NER) on Transcribed Broadcast News of Evalita 2011. The goal of the task was t...
Named entity extraction from Japanese broadcast news
This paper describes a method for named entity extraction from Japanese broadcast news. Our proposed named entity tagger gives entity categories for every character in order to deal with unknown words and entities correctly. This character-based tagger has models designed by maximum entropy modeling. We discuss the efficiency of the proposed tagger by comparison with a conventional word-based t...
EVALITA 2011: Description and Results of the Named Entity Recognition on Transcribed Broadcast News Task
This report describes features and outcomes of the Named Entity Recognition on Transcribed Broadcast News task at EVALITA 2011. This task represented a change with respect to previous editions of the NER task within the EVALITA evaluation campaign because it was based on automatic transcription of broadcast news. Four participants took part in the task and submitted a total of 9 runs. In this p...
Named Entity Recognition in Broadcast News Using Similar Written Texts
We propose a new approach to improving named entity recognition (NER) in broadcast news speech data. The approach proceeds in two key steps: (1) we detect block alignments between highly similar blocks of the speech data and corresponding written news data that are easily obtainable from the Web, (2) we employ term expansion techniques commonly used in information retrieval to recover named ent...
1998 Hub-4 Information Extraction Evaluation
This paper documents the Information Extraction Named-Entity Evaluation (IE-NE), one of the new spokes added to the DARPA-sponsored 1998 Hub-4 Broadcast News Evaluation. This paper discusses the information extraction task as posed for the 1998 Broadcast News Evaluation. This paper reviews the evaluation metrics, the scoring process, and the test corpus that was used for the evaluation. Finally...
Publication date: 2011